Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add SSL options to mssql #33071

Merged
merged 9 commits into from
Jan 4, 2024
Merged

Conversation

stephane-airbyte
Copy link
Contributor

@stephane-airbyte stephane-airbyte commented Dec 4, 2023

This is adding extra support for SSL options in our MS SQL source.

The MSSQL options are a bit different than postgres and MySql,so I'm not using their common logic. The flip side is that this change is not breaking compatibility!
We're basically adding the ability to send a certificate. (of the server or of the CA that signed the server certificate), and the ability to override the hostname used to connect in order to match the hostname of the server certificate.

While doing those changes, I also noticed that the password was actually mandatory in MsSqlSource, even though it wasn't marked as such in the spec.jsob.

The tests try to go the extra mile by connecting to the SQL server by IP rather than hostname, which triggers a failure to validate the certificate's host. And There's an extra test that connects by IP and passes the certificate's hostname in the spec, so the loop is closed.

On minor thing to note is that the MS SQL JDBC driver expects a keystore (technically a keystore file path), but we can't pass those through the current UI (because a key store is not a text file). So instead, we're passing a certificate, and the MsSqlSource converts that into a keystore file that is materialized in the container and its path is passed to the JDBC driver

tests are passing, connector version is bumpted

Resolves #30004

Copy link

vercel bot commented Dec 4, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Jan 4, 2024 7:20pm

Copy link
Contributor Author

stephane-airbyte commented Dec 4, 2023

Copy link
Contributor

github-actions bot commented Dec 4, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan.
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • You've updated the connector's metadata.yaml file any other relevant changes, including a breakingChanges entry for major version bumps. See metadata.yaml docs
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • Migration guide updated in docs/integrations/<source or destination>/<name>-migrations.md with an entry for the new version, if the version is a breaking change. See migration guide example
  • If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

case "unencrypted" -> additionalParameters.add("encrypt=false");
case "unencrypted" -> {
additionalParameters.add("encrypt=false");
additionalParameters.add("trustServerCertificate=true");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only added for clarity. By default, encrypt=false implies trustServerCertificate=true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trustServerCertificate is trust all essentially.

This will trust any server certificate. I assume for the exchange of credentials which is still encrypted event when encrypt=false (you mentioned this offline)

case "encrypted_trust_server_certificate" -> {
additionalParameters.add("encrypt=true");
additionalParameters.add("trustServerCertificate=true");
}
case "encrypted_verify_certificate" -> {
additionalParameters.add("encrypt=true");
additionalParameters.add("trustServerCertificate=false");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added for clarity. By default, encrypt=true implies trustServerCertificate=false


// trust store location code found at https://stackoverflow.com/a/56570588
final String trustStoreLocation = Optional
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was really not doing anything. If anything, it'd guarantee that the connection would fail (because with this, the sever certificate would have to be signed by one of the root authorities in the default trust store)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the old behavior is equivalent to not pushing anything as a trustStorelocation JDBC parameer

char[] pwdArray = password.toCharArray();
final File trustStoreFile;
try {
trustStoreFile = File.createTempFile("mssqlTrustStoreFile", "jks");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MSSQL driver takes a trust store file location as a parameter. Unfortunately, such a file is binary, which means it can't be uploaded from the current UI. It really doesn't matter, since a certificate is just as good, as long as we have either the server certificate that of the CA that signed the server certificate

@@ -4,7 +4,7 @@
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "MSSQL Source Spec",
"type": "object",
"required": ["host", "port", "database", "username"],
"required": ["host", "port", "database", "username", "password"],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the password is required in the parameter parsing logic in MssqlSource.java. Making it mandatory in the UI as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a breaking change? What is the expected behavior here?
I'm thinking of existing configs for example saved without a password

@@ -90,7 +90,7 @@
{
"title": "Encrypted (verify certificate)",
"description": "Verify and use the certificate provided by the server.",
"required": ["ssl_method", "trustStoreName", "trustStorePassword"],
"required": ["ssl_method"],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless I'm missing something, the previous required fields didn't even exist in the spec.json. Removing them...

@stephane-airbyte stephane-airbyte force-pushed the stephane/12-02-fix_MssqlSource.java branch 3 times, most recently from bd13cbd to 2bee629 Compare December 6, 2023 22:00
@@ -3,4 +3,5 @@
"port": 5555,
"database": "default",
"username": "default"
"password": "default"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that the password is a mandatory field, it needs to be added here too

super(testDatabase);
with(JdbcUtils.JDBC_URL_PARAMS_KEY, "loginTimeout=2");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this allows us to reduce the test running times from 1m to 2seconds for the tests that are expecting a failure to connect to the DB!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always want to set this for all test cases, though? I don't have a problem with it. I took a more conservative approach with maybeSetShorterConnectionTimeout but that conservatism may not have been warranted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problems with it so far...

@stephane-airbyte stephane-airbyte marked this pull request as ready for review December 6, 2023 22:06
@stephane-airbyte stephane-airbyte requested a review from a team as a code owner December 6, 2023 22:06
@stephane-airbyte stephane-airbyte force-pushed the stephane/12-02-fix_MssqlSource.java branch from 8764627 to e5150a9 Compare December 7, 2023 01:25
case "unencrypted" -> additionalParameters.add("encrypt=false");
case "unencrypted" -> {
additionalParameters.add("encrypt=false");
additionalParameters.add("trustServerCertificate=true");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trustServerCertificate is trust all essentially.

This will trust any server certificate. I assume for the exchange of credentials which is still encrypted event when encrypt=false (you mentioned this offline)

Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand all of this but for what I do, I don't have any significant objections.

@@ -9,7 +9,7 @@ data:
connectorSubtype: database
connectorType: source
definitionId: b5ea17b1-f170-46dc-bc31-cc744ca984c1
dockerImageTag: 3.1.0
dockerImageTag: 3.1.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI is going to complain about a missing release note.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably go to the latest 3.5.0

}

@ParameterizedTest
@EnumSource(CertificateKey.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yesss! Though perhaps having two enums, one for valid and another for invalid certificates, might be preferable for expressing these tests.

.build();
try {
AirbyteCatalog catalog = new MssqlSource().discover(config);
assertTrue(certificateKey.isValid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's just me but I would find some branching logic if (certificateKey.isValid) { ... easier to read here. Do I understand correctly that we expect discover to throw iff !isValid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we expect discover to work for valid certificates, and fail for invalid ones. I will split this test in 2 to be clearer

public void withSslCertificates(MSSQLServerContainer<?> container) {
// yes, this is uglier than sin. The reason why I'm doing this is because there's no command to
// reload a SqlServer config. So I need to create all the necessary files before I start the
// SQL server. Hence this horror
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever it takes!

EOF
} && /opt/mssql/bin/sqlservr
""",
"{hostName}", container.getHost());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine by me but perhaps using an env var to inject the host name would be simpler? Then again perhaps not, I don't know. Just a thought.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I guess that would work too. Even though I have more refactors in mind.

super(testDatabase);
with(JdbcUtils.JDBC_URL_PARAMS_KEY, "loginTimeout=2");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always want to set this for all test cases, though? I don't have a problem with it. I took a more conservative approach with maybeSetShorterConnectionTimeout but that conservatism may not have been warranted.

@stephane-airbyte stephane-airbyte force-pushed the stephane/12-02-fix_MssqlSource.java branch 2 times, most recently from e9d0acb to 19bf49f Compare December 19, 2023 17:27
@stephane-airbyte stephane-airbyte force-pushed the stephane/12-02-fix_MssqlSource.java branch from 19bf49f to 366b329 Compare December 19, 2023 18:00
@stephane-airbyte stephane-airbyte changed the base branch from master to stephane/12-19-move_mssql_to_latest_CDK_remove_LEGACY_state December 19, 2023 18:19
@stephane-airbyte stephane-airbyte force-pushed the stephane/12-02-fix_MssqlSource.java branch 3 times, most recently from 7901560 to d5e1af5 Compare December 21, 2023 00:31
@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Dec 21, 2023
@@ -1 +1 @@
version=0.8.0
version=0.8.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sure mssql points to the latest - 0.8.2(?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (0.10.4)

Copy link
Contributor

@rodireich rodireich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Stephane.
Just make sure all versions are aligned before merging,

@stephane-airbyte stephane-airbyte force-pushed the stephane/12-02-fix_MssqlSource.java branch from d5e1af5 to ed91686 Compare December 21, 2023 17:59
@katmarkham katmarkham removed the request for review from a team January 3, 2024 18:17
Base automatically changed from stephane/12-07-don_t_use_MsSQLConfigBuilder.withSsl_with_strings_but_instead_properly_named_and_typed_methods to master January 4, 2024 17:52
@postamar
Copy link
Contributor

postamar commented Jan 4, 2024

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/7413665676
✅ Successfully published Java CDK version=0.10.4!

@postamar postamar merged commit 15490f6 into master Jan 4, 2024
25 checks passed
@postamar postamar deleted the stephane/12-02-fix_MssqlSource.java branch January 4, 2024 19:50
@postamar
Copy link
Contributor

postamar commented Jan 4, 2024

Updated and merged as discussed during team standup earlier this week.

jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation CDK Connector Development Kit connectors/source/mssql
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Source MSSQL (SQL Server) SSL implementation not complete
4 participants